Abstract — We consider an additive Gaussian channel with additive Gaussian noise feedback. We first derive an upper bound on the n-block capacity (defined by Cover [1]) and show that this upper bound can be obtained by solving a convex optimization problem. Under stationarity assumptions on the Gaussian noise processes, we characterize the limit of the n-block upper bound and prove that this limit is an upper bound on the noisy feedback (Shannon) capacity.

Index Terms — Capacity, Gaussian channels with noisy feedback, 
convex optimization, stationarity. 
I. Introduction

We consider a time-varying additive Gaussian channel with time-varying additive Gaussian feedback; see Fig. 1. Here $M$ is a message index with $M \in \{1, 2, 3, \ldots, 2^{nR}\}$. The additive Gaussian channel is modeled as

$$Y_i = X_i + W_i, \quad i = 1, 2, \ldots$$

where the Gaussian noise $\{W_i\}_{i=1}^{\infty}$ satisfies $W^n \sim N_n(0, K_{w,n})$ for all $n \in \mathbb{Z}^+$. Similarly, the additive Gaussian feedback is modeled as

$$Z_i = Y_i + V_i, \quad i = 1, 2, \ldots$$

where the Gaussian noise $\{V_i\}_{i=1}^{\infty}$ satisfies $V^n \sim N_n(0, K_{v,n})$ for all $n \in \mathbb{Z}^+$. The noises $V$ and $W$ are assumed to be independent. Notice that we have not assumed stationarity of $W$ and $V$. The channel input $X_i$ is generated based on $M$ and $Z^{i-1}$, satisfying

$$\frac{1}{n} \sum_{i=1}^{n} E\, X_i^2(M, Z^{i-1}) \le P.$$

Since the capacity of this noisy feedback Gaussian channel is difficult to characterize, we seek a tight upper bound on the capacity in this paper.

In retrospect, additive Gaussian channels have been studied since the birth of information theory. When there is no feedback (i.e. $Z_i = \emptyset$ for all $i$), the channel input $X_i$ is independent of the previous channel outputs. The n-block capacity is characterized in [1] as

$$C_n = \max_{\mathrm{tr}(K_{x,n}) \le nP} \frac{1}{2n} \log \frac{\det(K_{x,n} + K_{w,n})}{\det K_{w,n}}$$

where the maximum is taken over all positive semidefinite matrices $K_{x,n}$. Here, the n-block capacity can be thought of as the capacity in bits per transmission if the channel is to be used for the time block $\{1, 2, \ldots, n\}$ [1].

Fig. 1. Gaussian channels with additive Gaussian noise feedback

If we assume stationarity of the process $\{W_i\}_{i=1}^{\infty}$, it is well known that the nonfeedback (Shannon) capacity is characterized by water-filling on the noise power spectrum. Specifically,

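$$C = \frac{1}{4\pi} \int_{-\pi}^{\pi} \log \frac{\max\{S_w(e^{i\theta}), \lambda\}}{S_w(e^{i\theta})} \, d\theta$$

where $S_w(e^{i\theta})$ is the power spectral density of the stationary noise process $\{W_i\}_{i=1}^{\infty}$. The water level $\lambda$ should satisfy

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \max\{0, \lambda - S_w(e^{i\theta})\} \, d\theta = P.$$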
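As a numerical illustration of the water-filling characterization above, the following minimal Python sketch approximates the two integrals by averages over a uniform frequency grid and finds the water level by bisection. The function name `water_filling_capacity`, the grid size, and the example spectrum are illustrative assumptions and are not part of the original development.

```python
import numpy as np

def water_filling_capacity(noise_psd, power, tol=1e-12):
    """Approximate nonfeedback capacity (bits per use) and water level.

    noise_psd: samples of the noise spectrum S_w on a uniform grid over [-pi, pi).
    power:     power budget P.
    """
    s = np.asarray(noise_psd, dtype=float)
    # The water level lies between min(S_w) (no power used) and max(S_w) + P.
    lo, hi = s.min(), s.max() + power
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        # A grid average approximates (1 / 2 pi) times the integral over theta.
        used = np.mean(np.maximum(0.0, lam - s))
        if used > power:
            hi = lam
        else:
            lo = lam
    lam = 0.5 * (lo + hi)
    # (1 / 4 pi) * integral of log(max{S_w, lam} / S_w) = 0.5 * grid average.
    capacity = 0.5 * np.mean(np.log2(np.maximum(s, lam) / s))
    return capacity, lam

# Hypothetical example: first-order moving-average noise with coefficient 0.1.
theta = np.linspace(-np.pi, np.pi, 1024, endpoint=False)
S_w = np.abs(1 + 0.1 * np.exp(1j * theta)) ** 2
capacity, level = water_filling_capacity(S_w, power=10.0)
```

For white noise with $S_w = 1$ and $P = 10$, the sketch returns a water level close to 11 and a capacity close to $\frac{1}{2} \log 11$ bits per channel use, as expected from the closed form.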

Note that the initial idea of water-filling comes from Shannon [2]. When there is perfect feedback (i.e. $Z_i = Y_i$ for all $i$), the n-block feedback capacity is characterized in [1] as

$$C_{fb,n} = \max_{B_n, K_{s,n}} \frac{1}{2n} \log \frac{\det\left((I_n + B_n) K_{w,n} (I_n + B_n)^T + K_{s,n}\right)}{\det K_{w,n}}$$

where the maximum is taken over all positive semidefinite matrices $K_{s,n}$ and all strictly lower triangular matrices $B_n$ satisfying

$$\mathrm{tr}(K_{s,n} + B_n K_{w,n} B_n^T) \le nP.$$

Similar to the nonfeedback case, if we assume stationarity of the process $\{W_i\}_{i=1}^{\infty}$, the perfect feedback (Shannon) capacity is characterized in [3] as

$$C_{fb} = \sup_{S_s, B} \frac{1}{4\pi} \int_{-\pi}^{\pi} \log \frac{S_s(e^{i\theta}) + |1 + B(e^{i\theta})|^2 S_w(e^{i\theta})}{S_w(e^{i\theta})} \, d\theta \qquad (1)$$

with power constraint

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S_s(e^{i\theta}) + |B(e^{i\theta})|^2 S_w(e^{i\theta}) \, d\theta \le P. \qquad (2)$$



Here $B(e^{i\theta})$ ranges over all strictly causal linear filters. When there is additive Gaussian noise in the feedback link, as shown in Fig. 1, no characterization of the capacity $C_{fb}^{noise}$ has been developed yet, to the authors' knowledge. So far, only a few papers have addressed this problem or its variations. [4] and [5] adopt the Cover-Pombra scheme for the noisy feedback case (colored Gaussian noise) and derive upper and lower bounds on its maximal achievable rate. Other works focus on the additive white Gaussian noise (AWGN) channel with AWGN feedback. For example, [6] derives upper and lower bounds on the reliability function and shows that noise in the feedback link renders noisy feedback communication fundamentally different from the perfect feedback case. [7], [8] and [9] propose specific coding/decoding schemes based on the Schalkwijk-Kailath scheme [10].

In this paper, we derive upper bounds on the n-block noisy feedback capacity and on the noisy feedback (Shannon) capacity. We show that the problem of computing the derived upper bound on the n-block capacity can be transformed into a convex form, which can be solved efficiently by standard numerical tools.

Notations: Uppercase and corresponding lowercase letters (e.g. $Y, Z, y, z$) denote random variables and realizations, respectively. $x^n$ represents the vector $[x_1, x_2, \ldots, x_n]^T$ and $x^0 = \emptyset$. $I_n$ represents the $n \times n$ identity matrix. $K_n > 0$ ($K_n \ge 0$) denotes that the $n \times n$ matrix $K_n$ is positive definite (semidefinite). $\log$ denotes the logarithm base 2 and $0 \log 0 = 0$. The expectation operator over $X$ is written as $E(X)$.

II. Preliminaries 

In this section, we review some definitions and lemmas from information theory.

Definition 1: [11] The mutual information $I(X;Y)$ between two random variables with joint density $f(x,y)$ is defined as

$$I(X;Y) = \int f(x,y) \log \frac{f(x,y)}{f(x) f(y)} \, dx \, dy.$$

Let $h(X)$ denote the differential entropy of a random variable $X$. Then it is clear that

$$I(X;Y) = h(Y) - h(Y|X).$$

We recall a useful lemma as follows.

Lemma 1: [11] Let the random vector $X \in \mathbb{R}^n$ have zero mean and covariance $K_{x,n} = E XX^T$ (i.e. $K_{ij} = E X_i X_j$, $1 \le i, j \le n$). Then

$$h(X) \le \frac{1}{2} \log (2\pi e)^n \det K_{x,n}$$

with equality if and only if $X \sim N(0, K_{x,n})$.

Next, we present the definition of directed information given by Massey [12].

Definition 2: The directed information from a sequence $X^n$ to a sequence $Y^n$ is defined by

$$I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i | Y^{i-1}).$$

We remark that Massey's definition of directed information implicitly restricts the time ordering of the random variables $(X^n, Y^n)$ to

$$X_1, Y_1, X_2, Y_2, \ldots, X_n, Y_n. \qquad (3)$$

We refer interested readers to [13] for the definition of directed information for an arbitrary time ordering of random variables. We next define a channel code for communication channels with noisy feedback.

Definition 3: (Channel Code) An $(n, M, \epsilon_n)$ channel code over time horizon $n$ consists of an index set $\{1, 2, 3, \ldots, M\}$, an encoding function $e: \{1, 2, \ldots, M\} \times \mathcal{Z}^{n-1} \to \mathcal{X}^n$, a decoding function $g: \mathcal{Y}^n \to \{1, 2, \ldots, M\}$ and an error probability satisfying

$$\frac{1}{M} \sum_{w=1}^{M} p(w \ne g(y^n) \,|\, w) \le \epsilon_n,$$

where $\lim_{n \to \infty} \epsilon_n = 0$.

We finally recall the Schur complement, which plays an important role in this paper.

Definition 4: [14] Consider an $n \times n$ symmetric matrix $X$ partitioned as

$$X = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}.$$

If $\det A \ne 0$, the matrix

$$S = C - B^T A^{-1} B$$

is called the Schur complement of $A$ in $X$.

The Schur complement has the following properties.

1) $\det X = \det A \det S$.

2) $X > 0$ if and only if $A > 0$ and $S > 0$.

3) If $A > 0$, then $X \ge 0$ if and only if $S \ge 0$.
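The first two properties can be checked numerically on a random example. The following Python sketch is only an illustration: the block sizes, the random seed, and the helper name `is_pd` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

# Build a symmetric positive definite X = [[A, B], [B^T, C]].
G = rng.standard_normal((n + m, n + m))
X = G @ G.T + 0.1 * np.eye(n + m)
A, B, C = X[:n, :n], X[:n, n:], X[n:, n:]

# Schur complement of A in X.
S = C - B.T @ np.linalg.solve(A, B)

def is_pd(M):
    """Positive definiteness test via the eigenvalues of a symmetric matrix."""
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# Property 1: det X = det A * det S.
det_ok = bool(np.isclose(np.linalg.det(X), np.linalg.det(A) * np.linalg.det(S)))

# Property 2: X > 0 if and only if A > 0 and S > 0.
pd_ok = is_pd(X) == (is_pd(A) and is_pd(S))
```

Both `det_ok` and `pd_ok` evaluate to `True` for this construction, since $X$ is positive definite by design.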

III. An Upper Bound on the n-Block Capacity

In this section, we first derive an upper bound on the n-block noisy feedback capacity (Theorem 1). Without loss of generality, we characterize this upper bound as an optimization problem by adopting the Cover-Pombra scheme (Theorem 2). We then transform the optimization problem into a convex form (Corollary 1). The n-block noisy feedback capacity is defined as follows [1].

Definition 5:

$$C_{fb,n}^{noise} = \max_{\frac{1}{n}\mathrm{tr}(K_{x,n}) \le P} \frac{1}{n} I(M; Y^n). \qquad (4)$$

We now define a new quantity and then prove that it is an upper bound on the above n-block capacity.

Definition 6:

$$\bar{C}_{fb,n}^{noise} = \max_{\frac{1}{n}\mathrm{tr}(K_{x,n}) \le P} \frac{1}{n} I(X^n \to Y^n | V^n). \qquad (5)$$

Theorem 1: For a given power constraint $P$,

$$C_{fb,n}^{noise} \le \bar{C}_{fb,n}^{noise}. \qquad (6)$$



Proof:

$$\begin{aligned}
I(M; Y^n) &= h(M) - h(M|Y^n) \\
&= h(M) - h(M|Y^n, V^n) - \left(h(M|Y^n) - h(M|Y^n, V^n)\right) \\
&\overset{(a)}{=} h(M|V^n) - h(M|Y^n, V^n) - I(M; V^n|Y^n) \\
&= I(M; Y^n|V^n) - I(M; V^n|Y^n) \\
&= h(Y^n|V^n) - h(Y^n|M, V^n) - I(M; V^n|Y^n) \\
&= \sum_{i=1}^{n} \left[ h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, M, V^n) \right] - I(M; V^n|Y^n) \\
&\overset{(b)}{=} \sum_{i=1}^{n} \left[ h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, M, V^n, X^i) \right] - I(M; V^n|Y^n) \\
&\overset{(c)}{=} \sum_{i=1}^{n} \left[ h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, X^i, V^n) \right] - I(M; V^n|Y^n) \\
&= \sum_{i=1}^{n} I(X^i; Y_i|Y^{i-1}, V^n) - I(M; V^n|Y^n) \\
&= I(X^n \to Y^n|V^n) - I(M; V^n|Y^n).
\end{aligned}$$

Here (a) follows from the fact that $M$ and $V^n$ are independent, (b) follows from the fact that $X^i$ is determined by $M$ and the outputs of the feedback link (i.e. $Y^{i-1} + V^{i-1}$), and (c) follows from the Markov chain $M - (Y^{i-1}, X^i, V^n) - Y_i$.

Since the conditional mutual information $I(M; V^n|Y^n) \ge 0$, we have

$$I(M; Y^n) \le I(X^n \to Y^n|V^n).$$

The proof is completed. ■
Next, we characterize the above upper bound. We first consider a scheme with linear encoding of the feedback signal and Gaussian signaling of the message (the Cover-Pombra scheme), shown in vector form in Fig. 2:

The channel input signal: $X^n = S^n + B_n(W^n + V^n)$
The channel output signal: $Y^n = S^n + B_n(W^n + V^n) + W^n$
The power constraint: $\mathrm{tr}(K_{s,n} + B_n(K_{v,n} + K_{w,n})B_n^T) \le nP$

where $S^n \sim N_n(0, K_{s,n})$ is the message information vector and $B_n$ is an $n \times n$ strictly lower triangular linear encoding matrix. Note that the one-step delay in the feedback link is captured by the structure of the matrix $B_n$. The random vectors $S^n$, $V^n$ and $W^n$ are assumed to be independent.

In the following, we prove that $\bar{C}_{fb,n}^{noise}$ can be characterized by the above coding scheme without loss of optimality.

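Theorem 2: $\bar{C}_{fb,n}^{noise}$ can be obtained as the optimal objective value of the following optimization problem.

$$\begin{aligned}
\underset{B_n, K_{s,n}}{\text{maximize}} \quad & \frac{1}{2n} \log \frac{\det\left((I_n + B_n) K_{w,n} (I_n + B_n)^T + K_{s,n}\right)}{\det K_{w,n}} \\
\text{subject to} \quad & \mathrm{tr}(K_{s,n} + B_n (K_{v,n} + K_{w,n}) B_n^T) \le nP \\
& K_{s,n} \ge 0, \quad B_n \text{ is strictly lower triangular}
\end{aligned} \qquad (7)$$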



Fig. 2. Gaussian channels with additive Gaussian noise feedback (Gaussian signalling and linear feedback)



Proof: First of all, we have

$$\begin{aligned}
I(X^n \to Y^n|V^n) &= \sum_{i=1}^{n} I(X^i; Y_i|Y^{i-1}, V^n) \\
&= \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, X^i, V^n) \\
&\overset{(a)}{=} \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, X^i) \\
&= \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(X_i + W_i|Y^{i-1}, X^i, W^{i-1}) \\
&= \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(W_i|W^{i-1}) \\
&= h(Y^n|V^n) - h(W^n)
\end{aligned}$$

where (a) follows from the Markov chain $V^n - (Y^{i-1}, X^i) - Y_i$.

We next show that maximizing $h(Y^n|V^n) - h(W^n)$ over the coding scheme shown in Fig. 2 does not lose optimality. Since we cannot affect the noise entropy $h(W^n)$, we need to maximize $h(Y^n|V^n)$ over all possible channel inputs $\{X_i\}_{i=1}^{n}$. To begin with, we present two observations on the channel inputs $X^n$:

1) $Y^n|V^n$ should be Gaussian in order to maximize $h(Y^n|V^n)$. Since $W^n$ is Gaussian and $Y^n = X^n + W^n$, $X^n|V^n$ must be Gaussian.

2) $X^n$ depends on $W^n + V^n$ rather than on $W^n$ and $V^n$ separately, since the channel outputs are fed back to the encoder without any encoding.

Therefore, the most general normal (i.e. Gaussian) causal dependence of $X^n$ on $Y^n$ satisfying the above two observations has the form $X^n = S^n + B_n(W^n + V^n)$. Then we have

$$\begin{aligned}
h(Y^n|V^n) - h(W^n) &= h(S^n + B_n(W^n + V^n) + W^n \,|\, V^n) - h(W^n) \\
&= h(S^n + (B_n + I_n)W^n \,|\, V^n) - h(W^n) \\
&= h(S^n + (B_n + I_n)W^n) - h(W^n).
\end{aligned}$$

By Lemma 1, the proof is complete.



Remark 1: The key idea of this theorem is to show

$$\bar{C}_{fb,n}^{noise} = \underset{\text{all coding schemes}}{\text{maximize}} \ \frac{1}{n} I(X^n \to Y^n|V^n) = \underset{\text{Cover-Pombra scheme}}{\text{maximize}} \ \frac{1}{n} I(X^n \to Y^n|V^n).$$

We remark that the Cover-Pombra scheme may not be the optimal (capacity-achieving) coding scheme in the noisy feedback case. We adopt this coding scheme here because it conveniently characterizes the proposed upper bound. The Cover-Pombra scheme may not apply if we look at a different upper bound.
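To make the optimization problem in Theorem 2 concrete, the following Python sketch evaluates its objective and the average input power for a given pair $(K_{s,n}, B_n)$. It only evaluates a candidate point and does not solve the maximization; the function name `cover_pombra_rate` and the example values are illustrative assumptions.

```python
import numpy as np

def cover_pombra_rate(K_s, B, K_w, K_v):
    """Return (rate in bits per use, average power) for a candidate (K_s, B).

    rate  = (1 / 2n) * log2( det((I + B) K_w (I + B)^T + K_s) / det K_w )
    power = (1 / n) * tr(K_s + B (K_v + K_w) B^T)
    """
    n = K_w.shape[0]
    if not np.allclose(B, np.tril(B, k=-1)):
        raise ValueError("B must be strictly lower triangular")
    I = np.eye(n)
    # Covariance of Y^n conditioned on V^n under X^n = S^n + B (W^n + V^n).
    K_y_given_v = (I + B) @ K_w @ (I + B).T + K_s
    logdet_num = np.linalg.slogdet(K_y_given_v)[1]
    logdet_den = np.linalg.slogdet(K_w)[1]
    rate = (logdet_num - logdet_den) / (2 * n * np.log(2))
    power = np.trace(K_s + B @ (K_v + K_w) @ B.T) / n
    return rate, power

# Hypothetical example: white forward and feedback noise, no feedback used (B = 0).
n = 4
rate, power = cover_pombra_rate(
    K_s=10.0 * np.eye(n), B=np.zeros((n, n)), K_w=np.eye(n), K_v=0.5 * np.eye(n)
)
```

A candidate is feasible for the problem when the returned `power` does not exceed $P$. Note that the feedback noise covariance enters only through `power`, not through `rate`.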

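The following corollary shows that the optimization problem (7) can be transformed into a convex form. This result has been shown in [5]. For the reader's convenience, we give the proof again in the Appendix.

Corollary 1: $\bar{C}_{fb,n}^{noise}$ can be obtained as the optimal objective value of the following convex optimization problem.

$$\begin{aligned}
\underset{H_n, B_n}{\text{maximize}} \quad & \frac{1}{2n} \log\det \begin{bmatrix} K_{v,n}^{-1} & B_n^T \\ B_n & H_n \end{bmatrix} - \frac{1}{2n} \log\det(K_{v,n}^{-1} K_{w,n}) \\
\text{subject to} \quad & \mathrm{tr}(H_n - K_{w,n} B_n^T - B_n K_{w,n} - K_{w,n}) \le nP \\
& \begin{bmatrix} H_n & I_n + B_n & B_n \\ (I_n + B_n)^T & K_{w,n}^{-1} & 0_n \\ B_n^T & 0_n & K_{v,n}^{-1} \end{bmatrix} \ge 0 \\
& B_n \text{ is strictly lower triangular}
\end{aligned}$$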
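The change of variables behind Corollary 1 can be sanity-checked numerically. The Python sketch below draws a random point $(K_{s,n}, B_n)$, forms $H_n = (I_n + B_n)K_{w,n}(I_n + B_n)^T + K_{s,n} + B_n K_{v,n} B_n^T$ as in the Appendix, and compares the original objective and power with their expressions in $(H_n, B_n)$. The random covariances, the seed, and all variable names are illustrative assumptions; the sketch checks the equivalence at one point and does not solve the problem.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def random_spd(k):
    """Random symmetric positive definite matrix (illustrative covariance)."""
    G = rng.standard_normal((k, k))
    return G @ G.T + np.eye(k)

def logdet2(M):
    """Base-2 logarithm of |det M|."""
    return np.linalg.slogdet(M)[1] / np.log(2)

K_w, K_v, K_s = random_spd(n), random_spd(n), random_spd(n)
B = np.tril(rng.standard_normal((n, n)), k=-1)   # strictly lower triangular
I, Z = np.eye(n), np.zeros((n, n))
K_v_inv, K_w_inv = np.linalg.inv(K_v), np.linalg.inv(K_w)

H = (I + B) @ K_w @ (I + B).T + K_s + B @ K_v @ B.T

# Objective in the original variables (K_s, B).
original = (logdet2((I + B) @ K_w @ (I + B).T + K_s) - logdet2(K_w)) / (2 * n)

# Objective in the new variables (H, B): log det of a matrix affine in (H, B).
block = np.block([[K_v_inv, B.T], [B, H]])
convex_form = (logdet2(block) - logdet2(K_v_inv @ K_w)) / (2 * n)

# Power expressed in both sets of variables.
power_original = np.trace(K_s + B @ (K_v + K_w) @ B.T)
power_in_H = np.trace(H - K_w @ B.T - B @ K_w - K_w)

# K_s >= 0 corresponds to a linear matrix inequality in (H, B).
lmi = np.block([[H, I + B, B], [(I + B).T, K_w_inv, Z], [B.T, Z, K_v_inv]])
lmi_min_eig = np.linalg.eigvalsh(lmi).min()
```

In the variables $(H_n, B_n)$ the objective is the log-determinant of a matrix that is affine in the variables, and the constraints are a linear trace inequality and a linear matrix inequality, which is what makes the problem convex.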



IV. An Upper Bound on the Capacity 

As we have shown, the upper bound on the n-block capacity is numerically solvable due to its convex form (e.g. by an interior-point method). As for the Shannon (infinite-block) capacity, however, there is still much work to be done. The problem is that formula (7) may not have a limit as $n \to \infty$, due to the time-varying nature of the noises $\{W_i\}$ and $\{V_i\}$. This is the main difficulty in developing an upper bound on the Shannon capacity. We handle this problem by assuming stationarity of the noises $\{W_i\}$ and $\{V_i\}$. In this section, we first show that under the stationarity assumption the limit of formula (7) exists, and we characterize this limit. Then we prove that the limit characterization is an upper bound on the noisy feedback (Shannon) capacity.

Theorem 3: Assume that $\{W_i\}$ and $\{V_i\}$ are stationary processes. Then the limit of formula (7) exists and can be characterized as

$$\sup_{S_s, B} \frac{1}{4\pi} \int_{-\pi}^{\pi} \log \frac{S_s(e^{i\theta}) + |1 + B(e^{i\theta})|^2 S_w(e^{i\theta})}{S_w(e^{i\theta})} \, d\theta \qquad (8)$$

with power constraint

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S_s(e^{i\theta}) + |B(e^{i\theta})|^2 \left(S_w(e^{i\theta}) + S_v(e^{i\theta})\right) d\theta \le P. \qquad (9)$$

Here, $S_s(e^{i\theta})$, $S_w(e^{i\theta})$ and $S_v(e^{i\theta})$ are the power spectral densities of $\{S_i\}$, $\{W_i\}$ and $\{V_i\}$, respectively, and $B(e^{i\theta}) = \sum_{k=1}^{\infty} b_k e^{ik\theta}$ is a strictly causal linear filter.

The main idea of the proof is taken from [3]. We refer interested readers to the Appendix for the details. Next, we show that the above limit characterization is an upper bound on the noisy feedback (Shannon) capacity.

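Theorem 4: Assume that $\{W_i\}$ and $\{V_i\}$ are stationary processes. Then $C_{fb}^{noise} \le \bar{C}_{fb}^{noise}$, where $\bar{C}_{fb}^{noise}$ denotes the limit characterized in (8).

Proof:

$$\begin{aligned}
C_{fb}^{noise} &\le \limsup_{n \to \infty} \max \frac{1}{n} I(M; Y^n) \\
&\overset{(a)}{\le} \limsup_{n \to \infty} \max_{\{X_i\}_{i=1}^{n}} \frac{1}{n} I(X^n \to Y^n|V^n) \\
&\overset{(b)}{=} \limsup_{n \to \infty} \max_{B_n, K_{s,n}} \frac{1}{2n} \log \frac{\det\left((I_n + B_n) K_{w,n} (I_n + B_n)^T + K_{s,n}\right)}{\det K_{w,n}} \\
&\overset{(c)}{=} \bar{C}_{fb}^{noise}
\end{aligned}$$

where (a) follows from Theorem 1, (b) follows from Theorem 2 and (c) follows from Theorem 3. ■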
Remark 2: Compared with the perfect feedback capacity characterization (1) and (2), the feedback noise $S_v(e^{i\theta})$ only affects the power allocation. If the noise in the feedback link increases (i.e. $S_v(e^{i\theta})$ grows large in some sense), the benefit of feedback in increasing the reliable transmission rate vanishes. That is, the noisy feedback system behaves like a nonfeedback system since, due to the power constraint, $B(e^{i\theta})$ approaches 0 as $S_v(e^{i\theta})$ grows.
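The observation in Remark 2 can be illustrated numerically. The following Python sketch evaluates the rate expression of Theorem 3 and the left-hand side of its power constraint on a uniform frequency grid for a fixed signal spectrum and a finite-tap filter. The function name `spectral_rate_and_power`, the single filter tap, and the spectra are illustrative assumptions; no supremum is computed.

```python
import numpy as np

def spectral_rate_and_power(S_s, b, S_w, S_v, theta):
    """Evaluate the rate integrand and the power integral on a uniform grid.

    b holds the taps b_1, b_2, ... of B(e^{i theta}) = sum_k b_k e^{i k theta}.
    Grid averages approximate (1 / 2 pi) times the integrals over [-pi, pi).
    """
    theta = np.asarray(theta, dtype=float)
    taps = np.asarray(b, dtype=complex)
    k = np.arange(1, taps.size + 1)
    B = np.exp(1j * np.outer(theta, k)) @ taps
    rate = 0.5 * np.mean(np.log2((S_s + np.abs(1 + B) ** 2 * S_w) / S_w))
    power = np.mean(S_s + np.abs(B) ** 2 * (S_w + S_v))
    return rate, power

theta = np.linspace(-np.pi, np.pi, 1024, endpoint=False)
S_w = np.abs(1 + 0.1 * np.exp(1j * theta)) ** 2   # moving-average forward noise
S_s = np.full_like(theta, 5.0)                    # hypothetical white signal spectrum
results = []
for sigma in (0.0, 0.4, 0.8):                     # white feedback noise levels
    S_v = np.full_like(theta, sigma)
    results.append(spectral_rate_and_power(S_s, [0.3], S_w, S_v, theta))
```

For a fixed filter the rate is the same for every feedback noise level, while the consumed power grows with $S_v$. Under a fixed budget $P$, the filter gain therefore has to shrink as the feedback noise grows.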

V. Simulation Results 

In this section, we present some simulation results to gain insight into the capacity of Gaussian channels with noisy feedback. The simulation results are taken from [5]; for the reader's convenience, we reproduce some of them here with a brief discussion and refer interested readers to [5] for more. We assume that the forward channel noise is a first-order moving average (1st-MV) Gaussian process. That is,

$$W_i = U_i + \alpha U_{i-1}$$

where $U_i$ is a white Gaussian process with zero mean and unit variance. We also assume that the feedback link is corrupted by additive white Gaussian noise with $K_{v,n} = \sigma I_n$ ($\sigma > 0$). Due to practical computation limits, we take coding block length $n = 30$ and power limit $P = 10$. We computed the upper bound on the n-block capacity derived in this paper and the lower bound (Theorem 2 in [5]) for averaging statistic $\alpha = 0.1$ in the 1st-MV channel, as shown in Fig. 3. The plots show that the n-block capacity, which lies in the region between the upper and lower bounds, decreases sharply as $\sigma$ grows. When $\sigma$ grows large enough (e.g. $\sigma = 0.8$ in Fig. 3), the rate increase due to feedback almost vanishes and, thus, the feedback system behaves like a nonfeedback system. Based on this observation, we may claim that the n-block capacity of the Gaussian channel with noisy feedback is sensitive to the feedback noise.
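The setup described above can be sketched in Python as follows. The code builds the noise covariance matrices for the stated parameters and computes the nonfeedback n-block capacity by water-filling over the eigenvalues of $K_{w,n}$, which is the baseline that the feedback system approaches for large $\sigma$. The function names are illustrative assumptions, and the sketch does not reproduce the bounds plotted in Fig. 3.

```python
import numpy as np

def ma1_covariance(n, alpha):
    """Covariance of W_i = U_i + alpha * U_{i-1} with unit-variance white U."""
    K = (1 + alpha ** 2) * np.eye(n)
    K += alpha * (np.eye(n, k=1) + np.eye(n, k=-1))
    return K

def nonfeedback_capacity(K_w, P):
    """n-block capacity without feedback, in bits per channel use."""
    eig = np.linalg.eigvalsh(K_w)
    n = eig.size
    # Bisection on the water level lam with sum(max(0, lam - eig)) = n * P.
    lo, hi = eig.min(), eig.max() + n * P
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        if np.sum(np.maximum(0.0, lam - eig)) > n * P:
            hi = lam
        else:
            lo = lam
    lam = 0.5 * (lo + hi)
    return np.sum(np.log2(np.maximum(eig, lam) / eig)) / (2 * n)

n, alpha, sigma, P = 30, 0.1, 0.8, 10.0
K_w = ma1_covariance(n, alpha)      # forward noise covariance K_{w,n}
K_v = sigma * np.eye(n)             # feedback noise covariance K_{v,n}
C_n = nonfeedback_capacity(K_w, P)  # baseline without feedback
```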
Fig. 3. The bounds on $C_{fb,n}^{noise}$ of the 1st-MV channel with $\alpha = 0.1$.

VI. Conclusion

We have derived an upper bound on the n-block capacity of additive Gaussian channels with additive Gaussian noise feedback. This n-block upper bound can be obtained by solving a convex optimization problem. By assuming stationarity of the Gaussian noises, we have characterized the limit of the n-block upper bound, which is an upper bound on the noisy feedback (Shannon) capacity.

In [15], the authors showed that for strong converse finite-alphabet channels with noisy feedback, the capacity is characterized by

$$C_{fb}^{noise} = \sup_{X} \lim_{n \to \infty} \frac{1}{n} I(X^n \to Y^n|V^n).$$

Therefore, we conjecture that the upper bound characterized in this paper is the true capacity. However, how to prove the achievability of this upper bound remains open.

VII. Appendix

A. Proof of Corollary 1

Proof: Let $H_n = (I_n + B_n) K_{w,n} (I_n + B_n)^T + K_{s,n} + B_n K_{v,n} B_n^T$. We have

$$\frac{1}{2n} \log \frac{\det\left((I_n + B_n) K_{w,n} (I_n + B_n)^T + K_{s,n}\right)}{\det K_{w,n}} = \frac{1}{2n} \log \frac{\det(H_n - B_n K_{v,n} B_n^T)}{\det K_{w,n}}.$$

We also have

$$\mathrm{tr}(K_{x,n}) \le nP \Leftrightarrow \mathrm{tr}(K_{s,n} + B_n (K_{v,n} + K_{w,n}) B_n^T) \le nP \Leftrightarrow \mathrm{tr}(H_n - K_{w,n} B_n^T - B_n K_{w,n} - K_{w,n}) \le nP.$$

Next, we have the following equivalences by applying the Schur complement.

(1)

$$\det \begin{bmatrix} K_{v,n}^{-1} & B_n^T \\ B_n & H_n \end{bmatrix} = \det(H_n - B_n K_{v,n} B_n^T) \det K_{v,n}^{-1}.$$

(2)

$$K_{s,n} \ge 0 \Leftrightarrow H_n - (I_n + B_n) K_{w,n} (I_n + B_n)^T - B_n K_{v,n} B_n^T \ge 0 \Leftrightarrow \begin{bmatrix} H_n & I_n + B_n & B_n \\ (I_n + B_n)^T & K_{w,n}^{-1} & 0_n \\ B_n^T & 0_n & K_{v,n}^{-1} \end{bmatrix} \ge 0.$$

By making these substitutions in the original formula, the proof is complete. ■

B. Proof of Theorem 3

Before showing the proof of the theorem, we need the following lemma.

Lemma 2: Consider the coding scheme shown in Fig. 2. Then

$$I(S^n; Y^n|V^n) = I(X^n \to Y^n|V^n).$$

Proof:

$$\begin{aligned}
I(S^n; Y^n|V^n) &= h(Y^n|V^n) - h(Y^n|S^n, V^n) \\
&= \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, S^n, V^n) \\
&\overset{(a)}{=} \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, S^n, V^n, X^i) \\
&\overset{(b)}{=} \sum_{i=1}^{n} h(Y_i|Y^{i-1}, V^n) - h(Y_i|Y^{i-1}, X^i, V^n) \\
&= \sum_{i=1}^{n} I(X^i; Y_i|Y^{i-1}, V^n) \\
&= I(X^n \to Y^n|V^n)
\end{aligned}$$

where (a) follows from the fact that $X^i$ is determined by $S^i$ and the outputs of the feedback link (i.e. $Y^{i-1} + V^{i-1}$), and (b) follows from the Markov chain $S^n - (Y^{i-1}, X^i, V^n) - Y_i$. ■

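Now, we are ready to give the proof of the theorem.

Proof: (sketch) Define $\bar{C}_{fb}^{noise}$ as formula (8). By the Szegő-Kolmogorov-Krein theorem, we have

$$\bar{C}_{fb}^{noise} = \sup_{\{X_i\} \text{ stationary}} h(\mathcal{Y}|\mathcal{V}) - h(\mathcal{W})$$

where $h(\mathcal{Y}|\mathcal{V})$ and $h(\mathcal{W})$ denote entropy rates and the supremum is taken over all stationary Gaussian processes $\{X_i\}_{i=0}^{\infty}$ of the form $X_i = S_i + \sum_{k=1}^{\infty} b_k (W_{i-k} + V_{i-k})$, where $\{S_i\}_{i=0}^{\infty}$ is stationary and independent of $(\{W_i\}_{i=0}^{\infty}, \{V_i\}_{i=0}^{\infty})$, such that $E[X_i^2] \le P$.

We first show that

$$\bar{C}_{fb,n}^{noise} \le \bar{C}_{fb}^{noise} \qquad (10)$$

for all $n$. Fix $n$ and assume $(K_{s,n}^*, B_n^*)$ achieves $\bar{C}_{fb,n}^{noise}$. Consider a block-wise white process $\{S_i\}_{i=kn+1}^{(k+1)n}$, $0 \le k < \infty$, independent and identically distributed according to $N_n(0, K_{s,n}^*)$, with the encoding matrix $B_n^*$ applied within each block. Then

$$\begin{aligned}
2n \bar{C}_{fb,n}^{noise} &\overset{(a)}{=} I(S_1^n; Y_1^n|V_1^n) + I(S_{n+1}^{2n}; Y_{n+1}^{2n}|V_{n+1}^{2n}) \\
&= h(S_1^n|V_1^n) + h(S_{n+1}^{2n}|V_{n+1}^{2n}) - h(S_1^n|Y_1^n, V_1^n) - h(S_{n+1}^{2n}|Y_{n+1}^{2n}, V_{n+1}^{2n}) \\
&= h(S_1^{2n}|V_1^{2n}) - h(S_1^n|Y_1^n, V_1^n) - h(S_{n+1}^{2n}|Y_{n+1}^{2n}, V_{n+1}^{2n}) \\
&\le h(S_1^{2n}|V_1^{2n}) - h(S_1^{2n}|Y_1^{2n}, V_1^{2n}) \\
&\overset{(b)}{=} I(X_1^{2n} \to Y_1^{2n}|V_1^{2n}) \\
&\overset{(c)}{=} h(Y_1^{2n}|V_1^{2n}) - h(W_1^{2n})
\end{aligned}$$

where (a) and (b) follow from Lemma 2 and (c) follows from the proof of Theorem 2. By repeating the same argument, we have

$$\bar{C}_{fb,n}^{noise} \le \frac{1}{kn} \left( h(Y_1^{kn}|V_1^{kn}) - h(W_1^{kn}) \right)$$

for all $k$. Next, we use the same technique as in [3] to show inequality (10). Define the time-shifted process $\{X_i(t)\}_{i=-\infty}^{\infty}$ where $X_i(t) = X_{i+t}$. Similarly define $\{Y_i(t)\}_{i=-\infty}^{\infty}$, $\{W_i(t)\}_{i=-\infty}^{\infty}$ and $\{V_i(t)\}_{i=-\infty}^{\infty}$. Introduce a random variable $T$, uniformly distributed over $\{1, 2, 3, \ldots, n\}$ and independent of everything else. Then it is easy to check that $\{X_i(T), Y_i(T), W_i(T), V_i(T)\}_{i=-\infty}^{\infty}$ is jointly stationary. Next, we define $\{\tilde{X}_i, \tilde{Y}_i, \tilde{W}_i, \tilde{V}_i\}_{i=-\infty}^{\infty}$ as a jointly Gaussian process with the same mean and autocorrelation as the stationary process $\{X_i(T), Y_i(T), W_i(T), V_i(T)\}_{i=-\infty}^{\infty}$. Thus,

$$\begin{aligned}
\bar{C}_{fb,n}^{noise} &\le \frac{1}{kn} \left( h(Y_1^{kn}(T)|V_1^{kn}(T), T) - h(W_1^{kn}(T)|T) \right) \\
&\overset{(a)}{=} \frac{1}{kn} \left( h(Y_1^{kn}(T)|V_1^{kn}(T), T) - h(W_1^{kn}) \right) \\
&\le \frac{1}{kn} \left( h(Y_1^{kn}(T)|V_1^{kn}(T)) - h(W_1^{kn}) \right) \\
&\le \frac{1}{kn} \left( h(\tilde{Y}_1^{kn}|\tilde{V}_1^{kn}) - h(\tilde{W}_1^{kn}) \right)
\end{aligned}$$

where (a) follows from the stationarity assumption on the noises $V$ and $W$. Taking $k \to \infty$, we obtain

$$\bar{C}_{fb,n}^{noise} \le h(\tilde{\mathcal{Y}}|\tilde{\mathcal{V}}) - h(\tilde{\mathcal{W}}) \le \bar{C}_{fb}^{noise}.$$

We now show the main idea of proving the other direction. Given $\epsilon > 0$, we let $\{X_i\}_{i=-\infty}^{\infty}$ achieve $\bar{C}_{fb}^{noise} - \epsilon$. Define the corresponding channel outputs as $\{Y_i\}_{i=-\infty}^{\infty}$. Then,

$$\begin{aligned}
\liminf_{n \to \infty} \bar{C}_{fb,n}^{noise} &= \liminf_{n \to \infty} \max_{\{X_i\}_{i=1}^{n}} \frac{1}{n} \left( h(Y_1^n|V_1^n) - h(W_1^n) \right) \\
&\ge \liminf_{n \to \infty} \frac{1}{n} \left( h(Y_1^n|V_1^n) - h(W_1^n) \right) \\
&= \lim_{n \to \infty} \frac{1}{n} \left( h(Y_1^n|V_1^n) - h(W_1^n) \right) \\
&= h(\mathcal{Y}|\mathcal{V}) - h(\mathcal{W}).
\end{aligned}$$

Since $\{X_i\}$ achieves $\bar{C}_{fb}^{noise} - \epsilon$, taking $\epsilon \to 0$, we obtain

$$\liminf_{n \to \infty} \bar{C}_{fb,n}^{noise} \ge \bar{C}_{fb}^{noise}.$$

The technical discussion on the power constraint is identical to that in [3], so we omit it here. Combined with inequality (10), we know that the limit of $\bar{C}_{fb,n}^{noise}$ exists and

$$\lim_{n \to \infty} \bar{C}_{fb,n}^{noise} = \bar{C}_{fb}^{noise}.$$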